Towards Debugging Sentiment Lexicons
نویسندگان
چکیده
Central to many sentiment analysis tasks are sentiment lexicons (SLs). SLs exhibit polarity inconsistencies. Previous work studied the problem of checking the consistency of an SL for the case when the entries have categorical labels (positive, negative or neutral) and showed that it is NPhard. In this paper, we address the more general problem, in which polarity tags take the form of a continuous distribution in the interval [0, 1]. We show that this problem is polynomial. We develop a general framework for addressing the consistency problem using linear programming (LP) theory. LP tools allow us to uncover inconsistencies efficiently, paving the way to building SL debugging tools. We show that previous work corresponds to 0-1 integer programming, a particular case of LP. Our experimental studies show a strong correlation between polarity consistency in SLs and the accuracy of sentiment tagging in practice.
منابع مشابه
MHSubLex: Using Metaheuristic Methods for Subjectivity Classification of Microblogs
In Web 2.0, people are free to share their experiences, views, and opinions. One of the problems that arises in web 2.0 is the sentiment analysis of texts produced by users in outlets such as Twitter. One of main the tasks of sentiment analysis is subjectivity classification. Our aim is to classify the subjectivity of Tweets. To this end, we create subjectivity lexicons in which the words into ...
متن کاملOn the Impact of Seed Words on Sentiment Polarity Lexicon Induction
Sentiment polarity lexicons are key resources for sentiment analysis, and researchers have invested a lot of efforts in their manual creation. However, there has been a recent shift towards automatically extracted lexicons, which are orders of magnitude larger and perform much better. These lexicons are typically mined using bootstrapping, starting from very few seed words whose polarity is giv...
متن کاملA Large Scale Arabic Sentiment Lexicon for Arabic Opinion Mining
Most opinion mining methods in English rely successfully on sentiment lexicons, such as English SentiWordnet (ESWN). While there have been efforts towards building Arabic sentiment lexicons, they suffer from many deficiencies: limited size, unclear usability plan given Arabic’s rich morphology, or nonavailability publicly. In this paper, we address all of these issues and produce the first publ...
متن کاملGood News or Bad News: Using Affect Control Theory to Analyze Readers' Reaction Towards News Articles
This paper proposes a novel approach to sentiment analysis that leverages work in sociology on symbolic interactionism. The proposed approach uses Affect Control Theory (ACT) to analyze readers’ sentiment towards factual (objective) content and towards its entities (subject and object). ACT is a theory of affective reasoning that uses empirically derived equations to predict the sentiments and ...
متن کاملInducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora
A word's sentiment depends on the domain in which it is used. Computational social science research thus requires sentiment lexicons that are specific to the domains being studied. We combine domain-specific word embeddings with a label propagation framework to induce accurate domain-specific sentiment lexicons using small sets of seed words. We show that our approach achieves state-of-the-art ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015